Abstract

We argue that C. Darwin and more recently W. Hennig worked at times under the simplifying 
assumption of an eternal biosphere. So motivated, we explicitly consider the consequences which follow 
mathematically from this assumption, and the infinite graphs it leads to. This assumption admits 
certain clusters of organisms which have some ideal theoretical properties of species, shining some light 
onto the species problem. We prove a dualization of a law of T.A. Knight and C. Darwin, and sketch a 
decomposition result involving the internodons of D. Kornet, J. Metz and H. Schellinx. A further goal 
of this paper is to respond to B. Sturmfels' question, "Can biology lead to new theorems?" 
1 Introduction

Dress et al. (2010) recently renewed interest in the set of all organisms ever to have lived, endowed with a 
directed graph structure. We thought it a natural extension to consider the set V of organisms (or other 
living things) which have ever lived or which will ever live in the future. We found that this extension had 
already been made explicit in Kornet (1993) and Kornet et al. (1995), where the future is needed in order 
to distinguish between temporary and permanent splits in genealogical networks. 

1.1 Darwin as Infinite Graph Theorist 

With our future-oriented view of systematics in mind, we recalled that particularly relevant passage from 
C. Darwin's On the Origin of Species: 

The Knight-Darwin Law. (Darwin 1872) "...it is a general law of nature that no organic being
self-fertilises itself for a perpetuity of generations; but that a cross with another individual is
occasionally - perhaps at long intervals of time - indispensable."

In (Francis Darwin 1898) we learn that the above principle is called the Knight-Darwin Law. But in a 
world where life goes extinct in finite time, this law is trivial: even if all reproduction in the entire world were 
asexual, the Knight-Darwin Law would still trivially hold for lack of an instance of "perpetual" (infinite) 
self-fertilisation to falsify it. 

Thus not only is Darwin concerned with the distant future of life, he seems to admit the possibility
of an infinite biosphere (the possibility that life will never go extinct), because if he explicitly
thought this impossible, then there would be no point suggesting the above Law.

Based on a careful reading of surrounding passages, we 
believe the Knight-Darwin Law can be glossed in modern 
graph-theoretical language as follows (hereafter, G is the
graph of all living organisms, past and future, with an arc
directed from u to v precisely if u is a parent of v):

The Knight-Darwin Law. (Graph-theoretical version) 
The graph G does not contain an infinite directed path of
vertices each of which has < 2 parents.
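
The graph-theoretical version can be explored on finite data, with the caveat already noted above: in a finite graph the law holds trivially, so the most one can do is measure how long a run of vertices with < 2 parents gets. The following is a minimal sketch under that assumption; the function name `longest_low_parent_run` and the dictionary encoding of the graph are hypothetical choices made for illustration, not part of the original argument.

```python
from functools import lru_cache


def longest_low_parent_run(parents):
    """Length (in vertices) of the longest directed path in a finite DAG
    all of whose vertices have fewer than 2 parents.

    `parents` maps every vertex to the set of its parents; every parent
    must itself appear as a key (hypothetical encoding of a fragment of G).
    """
    # Invert the parent relation to get children.
    children = {v: set() for v in parents}
    for v, ps in parents.items():
        for p in ps:
            children[p].add(v)

    # Vertices that would count toward "perpetual self-fertilisation".
    low = {v for v, ps in parents.items() if len(ps) < 2}

    @lru_cache(maxsize=None)
    def run_from(v):
        # Longest run starting at v and staying inside `low`.
        return 1 + max((run_from(c) for c in children[v] if c in low), default=0)

    return max((run_from(v) for v in low), default=0)


# Toy fragment: a -> b -> c -> d, where d also has a second parent x.
toy = {"a": set(), "b": {"a"}, "c": {"b"}, "x": set(), "d": {"c", "x"}}
run_length = longest_low_parent_run(toy)
```

In the toy fragment the run a, b, c is interrupted at d, which has two parents. The Knight-Darwin Law asserts that in the full infinite graph every such run is eventually interrupted in this way.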

Darwin did know of apparent counterexamples (see Francis Darwin 1898), reducing the
Knight-Darwin Law to a simplifying assumption; nevertheless, a simplifying assumption
of remarkable graph-theoretical sophistication for its time.
One might argue that infinite graphs were not well studied
outside of cutting-edge research mathematics until (König
1936), many decades after Darwin's work.

In a later section we will prove that, if true, the Knight-Darwin Law
logically implies a Dual Knight-Darwin Law.

1.2 Hennig as Infinite Graph Theorist 

There is compelling reason to suspect that Figure 4 on p. 19
of W. Hennig's (1966) book (see our Figure 1) implicitly depicts an infinite graph, joining Hennig with
Darwin in infinite graph theory.

The figure in question illustrates a cleavage in a biological network. Hennig refers (in his caption) to "the 
process of species cleavage" (emph. mine), suggesting a dynamic continuation of the genealogical network 
(graph) splitting up. Thus (we believe) Hennig must intend the population to continue beyond what is 
shown (what is shown is too ephemeral to be a process). A finite continuation would have the same problem, 
thus suggesting an infinite intended extension. In the text accompanying the figure, Hennig speaks of species 




Figure 1: A reproduction of a graph in W. Hennig's (1966) book (Hennig's Figure
4, p. 19). If we understand correctly, the triangle denotes the permanency of the
corresponding cleavage; this triangle would be superfluous if Hennig did not intend us to
imagine the graph continuing beyond what is shown.
which "persist over long periods of time" but which are "not absolutely permanent". As with the Knight-Darwin
Law, there would be no reason to even mention absolute permanence if Hennig assumed a finite end to the
biosphere.

Furthermore, Hennig seems to distinguish between one particular cleavage (indicated by a triangle) and
smaller cleavages also present in the graph. Surely the distinction cannot be as arbitrary¹ as "branch size
26" versus "branch size 1". Hennig discounted the minor cleavages because he meant us to imagine them
as temporary: they would disappear if we just saw a little more of the graph. But any finite number of
additional generations would still suffer minor cleavages,² and the only way to eliminate them is to continue
the graph infinitely far.

1.3 An Application to the Species Problem 

The species problem is the problem of finding a proper definition for the intuitive concept of a biological 
species. Many species notions exist, each with its own pros and cons, and none have managed to reach 
universal acceptance. 

By studying G with an explicit infinitude assumption, we have arrived at some cluster notions (described 
in Sections 3 and 4) which we have decided to call infinitary genera and infinitary species. These name 
choices will be justified below; infinitary genus is more or less an arbitrary name and could be replaced by 
infinitary family or infinitary order or infinitary tree-of-life-node; however, the name infinitary species is
important. 

We won't pretend that our species notion is like any species notion used by everyday biologists; it would 
be useless in the field, and we would never dream of suggesting it as a replacement for the practical species 
notions. However, we hope our notion will apply to the species problem in three ways: by offering a solution 
to a species sorites paradox; by theoretically reconciling competing morphological and non-morphological 
approaches to species definition; and by telling us something about the structure of the far future extremities 
of the tree of life. 

2 Some Further Justification for Infinitary Assumptions 

"So profound is our ignorance, and so high our presumption, that we marvel when we hear of the 
extinction of an organic being; and as we do not see the cause, we invoke cataclysms to desolate 
the world, or invent laws on the duration of the forms of life!" -Charles Darwin (1872) 

In assuming that infinitely many individuals will live, we are hardly the first scientists to approximate 
the finite by the infinite. The assumption is very similar to how physicists and chemists study symmetry 
groups of crystal patterns or tilings. We assume such patterns fill space to infinity, because otherwise it
would be very unnatural, if possible at all, to sensibly talk about their translation symmetries. We wonder
whether this might somewhat explain, by analogy, why species are so difficult to define.

2.1 Justification by Pursuit of New Mathematics 

By assuming infinity we will obtain distinctive responses to Sturmfels' (2005) question, "Can biology
lead to new theorems?" If biology is to be "mathematics' next physics, only better" (Cohen 2004) it must
generously contribute to the combinatorial and infinitary branches of mathematics, as physics has done.

One way to distinguish mathematical theorems is on the strength of the set-theoretical assumptions which 
they require. To be sure, biology has already contributed much to mathematics, but the author is unaware 
of any theorems from biology which hinge on the Axiom of Choice; we will exhibit such a theorem in Section 
4 (Theorem 

1 Baum & Shaw (1995, p. 294) and Velasco (2008, p. 868) point out the problem of big cleavages and small cleavages, and 
the resulting vagueness. This vagueness disappears under infinitary assumptions. 

2 Unless something unnatural occurred, for example, all the living specimens of each species joining to co-parent a single 
sterile child. 
2.2 Reduced Dependence on Scale

There seems to be some disagreement on whether the systematist ought to focus on individual living
organisms, or whether to zoom out and consider larger populations as atoms; see, for example, the back-and-forth
between de Queiroz & Donoghue (1988) (pro-individual organisms) and Nixon & Wheeler (1990) (the
opposite).

If the biosphere is finite, this scaling decision has a big effect on the shape of the biosphere. If one 
systematist takes organisms as atoms, and another takes some populations approximating species, the sizes 
of the resulting biospheres differ quite a bit. 

On the other hand, if both systematists operate under the assumption of an infinite tree of life, their
decisions do not affect the shape of the biosphere: viewed through either lens, the biosphere is infinite.

Note that this scaling decision can go both ways. Rather than considering individual organisms as the 
vertices of our graph, we could zoom inward and consider (say) individual X-chromosomes as vertices. In a 
finite world, this decision would massively blow up our graph G, but under infinitary assumptions, it makes 
no major difference. 

2.3 A word to the most hardcore finitist 

I hope my paper will be useful to you, too. In later sections there are results (e.g. Proposition 6) which are, 
I think, counter-intuitive enough that you might be able to get away with calling them paradoxes, and using 
them to advance the finitist argument by reductio ad absurdum, in the same way a critic of the Axiom of 
Choice might use the Banach-Tarski paradox. In other words, feel free to read the paper as a set of theorems 
specifically intended to disprove infinitary assumptions. 

3 Infinitary Genera 

In this and the next two sections, we will exhibit some interesting cluster notions which arise from the 
infinite biosphere assumption along with some additional assumptions which we consider less scandalous. 
The precise assumptions we make are as follows. 

• (A1) We assume (like Dress et al. (2010)) there are only finitely many roots, that is, parentless
individuals.

• (A2) We assume no individual is a parent of infinitely many children. 

• (A3) We assume each vertex $v \in V$ has a birthdate $t(v) \in \mathbb{R}$, that $t(u) < t(v)$ whenever $u$ is a parent
of $v$, and that for every real $x$, $\{v \in V : t(v) < x\}$ (the set of individuals born before time $x$) is finite.

• (A4) We assume G is infinite, that is, infinitely many individuals will live. 

We will define an infinitary genus to be an infinite set of individuals which is closed under ancestry, that 
is, which contains every ancestor of every member of itself. But it would be rash to make this definition 
without first motivating it. In mathematics, it is a sign of a notion's importance when it arises unexpectedly 
from seemingly-unrelated competing notions. We will arrive at our infinitary genus notion in precisely this 
way. 

To be clear, we use the word genus arbitrarily where any of family, order, or tree-of-life-node would work
just as well. Mathematical language is great for speaking about things which are maximal (e.g. the entire 
biosphere) and things which are minimal (e.g. infinitary species, as we'll see in the next section) but not at 
distinguishing between different intermediate levels (e.g. genera vs. families). 

With the above paragraphs in mind, let us make the following attempt at defining a group of individuals. 
We will define what we call a birthdate genus. Our hope is that this is particularly obvious, something the 
reader could easily invent on their own. We will then prove it to be equivalent to our desired infinitary genus
notion, in light of Assumptions A1-A4. Thus the following definition will not itself feature prominently in
the rest of the paper; its purpose is to motivate a definition to come after.
Definition 1. By a birthdate genus I mean a set $S$ of individuals such that there is some time $t$ such that
every member of $S$ has birthdate $\geq t$ and every external ancestor (that is, every individual outside $S$ but
with a descendant in $S$) has birthdate $< t$.

As a clustering notion, this is a weakened version of the much stronger notion which Dress et al. call the 
Apresjan cluster, following Steel (2007). It captures the intuition that a node in the Tree of Life must have 
evolved into existence at some particular time, and thus, all its members are at least that young, while every 
external ancestor is strictly older. 
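
On a finite fragment of the graph, the birthdate genus condition can be tested directly: a suitable time t exists exactly when every external ancestor is born strictly before every member. The sketch below assumes a hypothetical encoding in which `parents` maps each individual to the set of its parents and `birth` maps each individual to a real birthdate; the function names are illustrative only.

```python
def external_ancestors(parents, members):
    """Individuals outside `members` that have a descendant in `members`."""
    members = set(members)
    seen = set(members)
    stack = list(members)
    while stack:
        v = stack.pop()
        for p in parents[v]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen - members


def is_birthdate_genus(parents, birth, members):
    """True if some time t separates the external ancestors (all born
    before t) from the members (none born before t)."""
    members = set(members)
    outside = external_ancestors(parents, members)
    if not members or not outside:
        return True  # the condition holds vacuously
    return max(birth[u] for u in outside) < min(birth[v] for v in members)


# Toy fragment: r has children a and b; a has child c; b has child d.
toy_parents = {"r": set(), "a": {"r"}, "b": {"r"}, "c": {"a"}, "d": {"b"}}
toy_birth = {"r": 0.0, "a": 1.0, "b": 3.0, "c": 2.0, "d": 4.0}
```

In this toy fragment the set {b, d} is a birthdate genus (its only external ancestor r is older than both members), whereas {c, d} is not, because the external ancestor b is born after the member c.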

Convention 2. If $X$ and $Y$ are infinite sets, we say $X$ is almost equal to $Y$ (and write $X \approx Y$) if their
symmetric difference $(X - Y) \cup (Y - X)$ is finite.

The whole reason to define birthdate genera was to allow the following motivating result: 

Proposition 1. If S is an infinite set of individuals, then the following two conditions are equivalent: 

1. S is almost a birthdate genus. 

2. S is almost its own ancestral closure. 

Proof. Let $\bar{S}$ be the ancestral closure of $S$.

$(1 \Rightarrow 2)$ Assume $S$ is almost a birthdate genus. Thus, there is a birthdate genus $S'$ such that $S \approx S'$.
There is some time $t$ such that all members of $S'$ are born at time $\geq t$ and all external ancestors of $S'$ are
born at time $< t$.

Now, we claim that $\bar{S} - S$ is finite. Suppose $u \in \bar{S} - S$. Since $u \in \bar{S}$, $u$ is an ancestor of an individual
$v \in S$. There are three cases:

• Case 1: $u \in S' - S$. Since $S' - S$ is finite, this can only occur for finitely many $u$.

• Case 2: $u \notin S' - S$ and $v \in S'$. Then $u$ is an external ancestor of $S'$, so is born before time $t$. By
Assumption A3, this can only occur for finitely many $u$.

• Case 3: $u \notin S' - S$ and $v \notin S'$. Since $v \in S$, $v \in S - S'$. Since $S - S'$ is finite, there are only finitely
many possibilities for $v$, and Assumption A3 implies that those finitely many possibilities only have
finitely many ancestors, so again, Case 3 can only occur for finitely many $u$.

This shows $\bar{S} - S$ is finite. Since $S - \bar{S} = \emptyset$ is also finite, $S \approx \bar{S}$ as desired.

$(2 \Rightarrow 1)$ Conversely, assume $S \approx \bar{S}$. Thus, there are only finitely many external ancestors of $S$, call them
$S_1$. Of these finitely many individuals, let $t$ be the latest birthdate which occurs. By Assumption A3, only
finitely many individuals were ever born as of time $t$, call them $S_0$. We claim $S - S_0$ is a birthdate genus.
Certainly any member of $S - S_0$ is born after time $t$, by definition of $S_0$. Now suppose $v \notin S - S_0$ has a
descendant in $S - S_0$. It could be $v \in S_0$, in which case $v$'s birthdate is no later than $t$, by definition of $S_0$.
Otherwise, $v$ must be outside of $S$, and thus, having a descendant in $S$, $v$ is in $S_1$, so that (by choice of $t$),
again $v$ is born no later than $t$. □

If we ignore finite sets as being insignificant, and treat infinite birthdate genera as being identical to sets
which are almost infinite birthdate genera, we arrive (via Proposition 1) at a much simpler definition, and
one which is easier to work with, at the price of an error which is infinitesimal.

Definition 3. An infinitary genus is an infinite, ancestrally closed set of individuals. 
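
Ancestral closure, the working half of Definition 3, is easy to compute on a finite fragment of the graph; infinitude itself cannot be represented this way, so the following is only a sketch of the closure operation. The names `ancestral_closure`, `is_ancestrally_closed`, and the dictionary encoding are hypothetical and introduced purely for illustration.

```python
def ancestral_closure(parents, members):
    """Return `members` together with every ancestor of every member.

    `parents` maps each individual to the set of its parents
    (a hypothetical encoding of a finite fragment of G).
    """
    closure = set(members)
    stack = list(closure)
    while stack:
        v = stack.pop()
        for p in parents[v]:
            if p not in closure:
                closure.add(p)
                stack.append(p)
    return closure


def is_ancestrally_closed(parents, members):
    """True if every parent of every member is itself a member."""
    members = set(members)
    return all(parents[v] <= members for v in members)


# Toy pedigree: root r, children a and b, and c the child of a and b.
pedigree = {"r": set(), "a": {"r"}, "b": {"r"}, "c": {"a", "b"}}
closure_of_c = ancestral_closure(pedigree, {"c"})
```

Checking only parents suffices in `is_ancestrally_closed`, because a set containing the parents of each of its members contains, by induction, all of their ancestors.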

Right away we would like to demonstrate how closely interwoven Definition 3 is with the point-set
topology of cladistics: further evidence of the legitimacy of the definition and its worthiness of study.

Proposition 2. An infinite set S is an infinitary genus if and only if it is a closed set in the topology on V 
where clades are the basic open sets (by a clade of course we mean an individual and its descendants).
Proof. In other words, we must show that for infinite $S$, $S$ is an infinitary genus if and only if its complement
$S^c$ is a union of clades.

$(\Rightarrow)$ Assume $S$ is an infinitary genus. Let $v \in S^c$. Since $S$ is ancestrally closed, $v$ cannot be an ancestor
of any element of $S$. Thus, the clade $C_v$, consisting of $v$ and all its descendants, is disjoint from $S$. Therefore
$S^c$ is the union

$$S^c = \bigcup_{v \in S^c} C_v$$

of clades.

$(\Leftarrow)$ Assume $S^c$ is a union of clades and let $v \in S$ have an ancestor $u$; we must show $u \in S$. If $u$ were not
in $S$, it would be in $S^c$, hence in some clade entirely contained in $S^c$. But $v$ would be in that clade, forcing
$v \in S^c$, a contradiction. □
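
The equivalence in this proof between ancestral closure and having a complement that is a union of clades does not itself use infinitude, so it can be checked exhaustively on a small finite graph. The sketch below does so by brute force over all subsets; the function names and the encoding are hypothetical, and the check is a sanity test of the finite analogue rather than a proof.

```python
from itertools import combinations


def clade(children, v):
    """The individual v together with all of its descendants."""
    seen, stack = {v}, [v]
    while stack:
        x = stack.pop()
        for c in children[x]:
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen


def closure_matches_clade_topology(parents):
    """Check, for every subset S of a finite DAG, that S is ancestrally
    closed exactly when its complement is a union of clades."""
    vertices = set(parents)
    children = {v: set() for v in vertices}
    for v, ps in parents.items():
        for p in ps:
            children[p].add(v)
    clades = {v: clade(children, v) for v in vertices}

    for size in range(len(vertices) + 1):
        for combo in combinations(sorted(vertices), size):
            subset = set(combo)
            complement = vertices - subset
            closed = all(parents[v] <= subset for v in subset)
            # A set is a union of clades iff it contains the clade of
            # each of its own elements.
            open_complement = all(clades[v] <= complement for v in complement)
            if closed != open_complement:
                return False
    return True


toy = {"r": set(), "a": {"r"}, "b": {"r"}, "c": {"a", "b"}, "d": {"c"}}
agrees = closure_matches_clade_topology(toy)
```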



4 Infinitary Species 

The motivation for our species notion is the conviction that species should be the smallest taxa, below
genera, families, orders and so on. Thus, we will define an infinitary species to be an infinitary genus in
which no proper subset is an infinitary genus.

But why does such an obvious-seeming approach work for us when it has not worked for the finitist? If 
a set of organisms constitutes a species, then presumably it would still constitute a species if one specimen 
were eliminated. According to our minimalist approach, if we could throw that specimen out of the species 
and still have a species, then we must do so. In a finite world, this repeated act of discarding specimens 
would render the species empty. This is an instance of the sorites paradox, or paradox of the heap (Hyde 
2011). However, there is a way out of the trap: if we require our species to be infinite and ancestrally closed, 
the premise that we can always discard one specimen and retain a species gains a caveat: we cannot discard 
the specimen if it has so many descendants that doing so would render the species finite. We shall see that 
this causes the sorites paradox to vanish. 

Definition 4. An infinitary species (or more simply an inspecies) is an infinitary genus in which no strictly 
smaller infinitary genus is a subset. 
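
To make Definition 4 concrete, here is a small worked example. The graph is a hypothetical toy model chosen only for illustration, and the names $r$, $a_n$, $b_n$, $A$, $B$ are introduced for this sketch alone.

```latex
% Toy model (an assumption for illustration): one root, two lineages that never interbreed.
V = \{r\} \cup \{a_n : n \ge 1\} \cup \{b_n : n \ge 1\}, \qquad
r \to a_1,\quad r \to b_1,\quad a_n \to a_{n+1},\quad b_n \to b_{n+1},
\qquad t(r) = 0,\quad t(a_n) = t(b_n) = n.

% Every nonempty ancestrally closed set has the form
\{r\} \cup \{a_n : n \le j\} \cup \{b_n : n \le k\}, \qquad j, k \in \{0, 1, 2, \ldots, \infty\},
% and it is infinite exactly when j = \infty or k = \infty.

% The minimal infinite ones are therefore
A = \{r\} \cup \{a_n : n \ge 1\}, \qquad B = \{r\} \cup \{b_n : n \ge 1\}.
```

In this model there is one root, no individual has more than two children, and only finitely many individuals are born before any given time, so the assumptions of Section 3 hold. The infinitary genera are the sets above with $j = \infty$ or $k = \infty$, and exactly two of them, $A$ and $B$, are inspecies; they share only the root $r$.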

A priori, there might be no inspecies, either because of the sorites paradox, or because
infinitary genera can be refined further and further with no end.
The main technical theorem of this paper is:

Theorem 3. Given assumptions A1-A4, there is at 
least one inspecies. 

The proof will use the Axiom of Choice, in its Zorn's Lemma (Zorn 1935, Gowers 2008) form.³ We
will briefly review Zorn's Lemma before proving the theorem (we state the lemma in a form most suitable
for the use to which we will apply it). A reader willing to take our word for it can safely skip the proof,
but bear in mind that if any one of A1, A2, A3, or A4 were denied, the other three would not imply the
theorem.

Recall that a binary relation $\supseteq$ on a set $\mathcal{Z}$ is a
partial order if it is reflexive, transitive, and anti-symmetric
(anti-symmetry means that if $X \supseteq Y$ and $Y \supseteq X$ then $X = Y$).
A family $(Z_i)_{i \in I}$ of elements of $\mathcal{Z}$, indexed by a linear order $I$, is a $\supseteq$-chain if

3 By A3, the biosphere is countable; it can be shown that only the countable axiom of choice is needed to guarantee existence 
of inspecies. 




Figure 2: A species (shaded) nested within 
a larger genus, within a larger order, within 
a larger family, and so on. 
$Z_i \supseteq Z_j$ whenever $i < j$. A bound for this chain is an element $Z \in \mathcal{Z}$ such that $Z_i \supseteq Z$ for every $i \in I$. An
element $X \in \mathcal{Z}$ is extremal if the only $Y \in \mathcal{Z}$ such that $X \supseteq Y$ is $Y = X$ itself.

Theorem 4. (Zorn's Lemma) Let $\supseteq$ be a partial order on a nonempty set $\mathcal{Z}$. Suppose that every nonempty
$\supseteq$-chain has a bound. Then $\mathcal{Z}$ has an extremal element.

Armed with this mathematical logical sledgehammer, we can prove Theorem 3.

Proof of Theorem 3. We may assume the following (*): that every non-root individual is a descendant of
every root. This is safe to assume because if it is untrue, we can (thanks to A1) add an imaginary "super
root" to our graph and declare it to be the lone parent of all actual roots.

Let $\mathcal{Z}$ be the set of all infinitary genera. Notice $\mathcal{Z} \neq \emptyset$ since it contains $V$ itself: the set of all individuals
is an infinitary genus by Assumption A4. Notice that the superset relation $\supseteq$ partially orders $\mathcal{Z}$, and an
infinitary genus $X \in \mathcal{Z}$ is an inspecies precisely if it is extremal with respect to $\supseteq$. Therefore, by Zorn's
Lemma, it is sufficient to let $(Z_i)_{i \in I}$ be an arbitrary nonempty $\supseteq$-chain from $\mathcal{Z}$ and show it has a bound
in $\mathcal{Z}$. We will show that $Z = \bigcap_{i \in I} Z_i$ is such a bound. Obviously for every $j \in I$, $Z_j \supseteq \bigcap_{i \in I} Z_i$, so all that
remains is to show $\bigcap_{i \in I} Z_i$ is in $\mathcal{Z}$ (i.e., that it is an infinitary genus).

First the easy part: we show that $\bigcap_{i \in I} Z_i$ is ancestrally closed. Let $v \in \bigcap_{i \in I} Z_i$ and let $u$ be an ancestor
of $v$. For every $i \in I$, we have $v \in Z_i$, and $Z_i$ is ancestrally closed, so $u \in Z_i$; by arbitrariness of $i$, this shows
$u \in \bigcap_{i \in I} Z_i$, establishing ancestral closure of $Z$.

The difficult part is to show that $Z$ is infinite. Assume, for sake of contradiction, that $Z$ is finite.

Let $q_1, \ldots, q_m$ be those individuals who have a parent in $Z$ but who are not in $Z$ themselves: there are
finitely many of these by Assumption A2. For any $1 \leq k \leq m$, the fact that $q_k \notin Z = \bigcap_{i \in I} Z_i$ means there
is some $i_k \in I$ such that $q_k \notin Z_{i_k}$, and thus (since $(Z_i)_{i \in I}$ is a chain) more strongly $q_k \notin Z_i$ whenever $i \geq i_k$.
Therefore, letting $i = \max\{i_1, \ldots, i_m\}$, we have: $q_1 \notin Z_i$, $q_2 \notin Z_i$, ..., $q_m \notin Z_i$.

Now, $Z_i$ is infinite since it's an infinitary genus. Thus $Z_i - Z$ is infinite since $Z$ is finite. In particular,
$Z_i - Z$ is nonempty. Thus, we can pick an individual $r \in Z_i - Z$ with shortest possible reverse-path to a root
(every individual has a reverse-path to a root by Assumption A3). By (*), $Z$ contains all roots, so $r$ is not
itself a root, and that shortest path is not the empty path. So $r$ has a parent on that minimal-length path.
This parent must be in $Z$, because otherwise, we would have chosen the parent instead of $r$ (the parent has
a shorter path to a root).

I've shown that $r$ has a parent in $Z$, but $r$ itself was chosen outside $Z$. By definition this means $r$ is one
of the $q_1, \ldots, q_m$. This is nonsense, because $r \in Z_i$ and all of the $q_1, \ldots, q_m$ are absent from $Z_i$.

By contradiction, $Z$ is infinite, and so (being ancestrally closed) it is an infinitary genus. I've shown that
an arbitrary nonempty chain has a bound; by Zorn's Lemma, $\mathcal{Z}$ has an extremal element, i.e., there is an
inspecies. □



5 Results 

The key to the structural properties of inspecies is the following result, remarkable for being so powerful 
while having such a simple proof. 

Proposition 5. Suppose $C$ is an inspecies and $S \subseteq C$ is any infinite subset. Then every individual in $C$
has a descendant in $S$.

Proof. Let $C'$ be the set of ancestors of individuals in $S$. Then $C'$ is clearly ancestrally closed; since $S$ is
infinite, Assumption A2 implies $C'$ is infinite; altogether, $C'$ is an infinitary genus. And $C' \subseteq C$ since $C$ is
ancestrally closed, so $C' = C$ by minimality. □

This has remarkable consequences for individuals within inspecies. Take any property P, which may hold 
of some individuals and not hold of other individuals. Within an inspecies, every individual has a descendant 
with property P, or every individual has a descendant without property P (or both). For instance, in an 
inspecies everybody has a vertebrate descendant, or everybody has an invertebrate descendant. If there is 
a fixed universal upper bound on the number of hairs an individual can have on their body, then for any 
inspecies, there is some number n such that everybody in the inspecies has a descendant with exactly n hairs 
on their body. If we let Bertrand Russell choose the property P, he would no doubt choose the property 
"not a descendant of Bertrand Russell" and thereby lead us to: 
Proposition 6. In an inspecies, every individual is an ancestor of almost every individual.

Proof. Let u be a member of an inspecies and let S be the set of members of that inspecies who are not 
descended from u. If S were infinite, then by Proposition 5, u would have a descendant in S, which is absurd.
So S is finite, meaning almost every member of the inspecies descends from u. □ 

With Proposition 6 in our arsenal we can productively compare the inspecies with the tight cluster notion
of Dress et al. If $C$ is an inspecies and we let $D(\supseteq_{\sim} C)$ consist of all individuals in $V$ whose descendants
include almost all of $C$, then Proposition 6 implies $D(\supseteq_{\sim} C) = C$. Therefore, every inspecies is a kind of
one-sided version of a tight cluster (it is one-sided because the clusters of Dress et al. (2010) are designed to 
be closed descendantially (at least when unborn future individuals are ignored) and ours are not). A similar 
observation goes for Baum's (2009) concept of organismic exclusivity (as described by Dress et al.) 

The next proposition will theoretically reconcile morphological and non-morphological approaches to the 
species problem. 

Proposition 7. (Morphological Trichotomy) Let C be an inspecies and let P be a property of individuals. 
Exactly one of the following is true: 

1. P holds of almost every member of C, 

2. P fails of almost every member of C, or 

3. every member of C has both a descendant satisfying P and a descendant failing P. 

Proof. Immediate by Proposition 5: if neither (1) nor (2) holds, then both the P-conformists and the P-rebels
are infinite in number. □

For example, in an inspecies of birds, a property of type 1 might be "has feathers", a property of type 2
might be "has gills", and a property of type 3 might be "is male", or even "hatched between Monday and
Thursday".

In our opinion, Proposition 7 theoretically reconciles morphological species notions (recognizable,
ecological, phylogenetic, and so on) with non-morphological species concepts. Inspecies are defined solely based on
ancestral relations, with reckless disregard for any other considerations, and yet Morphological Trichotomy
(Proposition 7) rigorously shows that morphological aspects of species are hard to avoid.

If we have understood him correctly, de Queiroz (2007) would say that the uniformities suggested by 
Proposition 7 are what he calls former secondary species criteria, which "can be used to define subcategories
of the species category- that is, to recognize different classes of species" and which, therefore, we submit as 
evidence verifying we are justified in referring to infinitary species as a species notion. 

Proposition 8. If two inspecies have infinite intersection, they are equal. That is, distinct inspecies have 
almost no members in common. 

Proof. Let $C_1$ and $C_2$ be two inspecies with $|C_1 \cap C_2| = \infty$. By symmetry it's enough to show $C_1 \subseteq C_2$.
Let $v$ be an individual in $C_1$. By Proposition 5, $v$ has a descendant in $C_1 \cap C_2$, hence in $C_2$. Since $C_2$ is
ancestrally closed, $v \in C_2$. □

Sadly, our infinitary genus notion does not have the coveted nesting property described by Dress et al. 
On the other hand, Proposition 8 shows that (ignoring finite sets) inspecies have that property too strongly!
By this we mean that if we copy the techniques in Dress et al. (2010) to turn a nested cluster notion into 
a forest notion, we get a degenerate forest with all vertices isolated and no arcs at all. This hints that, 
where the Tree of Life is concerned, inspecies may be the "leaves at infinity," while the clusters of Dress et 
al. (2010) or the composite species of Kornet and McAllister (2005) may be the actual nodes. 

As promised in the introduction, we will now prove that the Knight-Darwin Law implies a Dual
Knight-Darwin Law (see Figure 3).
Theorem 9. Assume the Knight-Darwin Law: that G has no infinite path of vertices each with < 2
parents. Assume also that A1-A4 hold. Then the following facts are implied:

1. (The Dual Knight-Darwin Law) No inspecies has an infinite path of vertices each with < 2
children.

2. There are infinitely many individuals each of whom has multiple children.

3. Within an inspecies, every individual has a pair of descendants that breed together.




Figure 3: The Knight-Darwin Law (top) and the Dual Knight-Darwin Law (bottom).



Proof. (1) Let $C$ be an inspecies, and let $v_1, v_2, \ldots$ be any infinite path in $C$. We must show some $v_i$ has
$\geq 2$ children. For sake of contradiction, assume not. By the Knight-Darwin Law, there is some $v_{i_1}$ with $\geq 2$
parents. And then, by the Knight-Darwin Law applied to $v_{i_1+1}, v_{i_1+2}, \ldots$, there is some $v_{i_2}$ ($i_2 > i_1$) with
$\geq 2$ parents. This process continues forever: there are $i_1 < i_2 < i_3 < \cdots$ such that each $v_{i_j}$ has $\geq 2$ parents.
For each $j > 1$, let $p_j$ be a parent of $v_{i_j}$ different than $v_{i_j - 1}$. All these new parents are in $C$ by ancestral
closure, and by Assumption A2, there are infinitely many of these new parents. By Proposition 5, one of the
$p_j$ is descended from $v_1$. So $v_{i_j}$ has at least two distinct parents, $v_{i_j - 1}$ and $p_j$, both of whom are descended
from $v_1$ (or possibly one of them could equal $v_1$). This implies (by acyclicity) that if we look at the path
$v_1, \ldots, v_{i_j}$, there must have been a fork somewhere, in order for $p_j$ to be born and breed with $v_{i_j - 1}$. This
contradicts the assumption that none of the $v_i$'s have $\geq 2$ children.

(2) Immediate from (1) along with Theorem 3.

(3) Let $v$ be an individual in an inspecies $C$. By Proposition 5, there are only finitely many individuals
in $C$ not descended from $v$, and by Assumption A2, these finitely many non-$v$-descendants have only finitely
many children. Thus, there must be some individual in $C$ descended from $v$, both of whose parents are
also descended from $v$. Those two parents are therefore a pair of descendants of $v$ which breed together, as
desired. □



5.1 A Response to Sturmfels 

B. Sturmfels asked (2005): "Can biology lead to new theorems?" We have imposed upon ourselves three
criteria that we feel obliged to meet in order to satisfy ourselves in responding. In our opinion, in order for
a mathematical biology paper to "lead to new theorems," it should:

• (T1) Include simple and interesting theorems about some new type of system motivated by biology.

• (T2) Demonstrate breadth, by using the new system to give original new proofs of some already known 
results (or open problems). 

• (T3) Demonstrate nontriviality, by stating a nontrivial open problem which is not too contrived. 

If we have erred in these criteria, we have attempted to err on the side of stringency. T3 could of course 
be replaced by the inclusion of a theorem with an interesting and difficult proof (e.g. a non-open problem). 

For T1, we consider Theorem 3 and Propositions 5, 6, 7, and 8 sufficient. We have given some interesting
theorems about new systems motivated by biology.

For T2, we consider Theorem 9 partially satisfactory. We consider it common knowledge that a 
genealogical human family tree must necessarily either keep branching (in the sense of multiple children being 
born of a parent) or keep bringing in new roots via marriage, or both, and if it stops doing so, successive 
generations will dwindle in size until extinction. Theorem 9 (part 2) formalizes this, and (part 1) (along 
with Theorem 3) generalizes it. Likewise, Theorem 9 (part 3) formalizes, strengthens, and reproves a certain 
notion that inbreeding is unavoidable. The original notion is that if we assume no inbreeding, then going 
back (say) 50 generations we should expect a human population of at least $2^{50} \approx 10^{15}$ by counting nothing 
but the distinct great$^{48}$-grandparents of this author. To satisfy T2 further, we have used infinite graphs in 
systematic biology to give an alternate proof (in (Alexander, preprint)) of a result from (Johnston 2010) 
about Conway's Life-like games. 
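
The generation count in the inbreeding argument can be checked directly. The following is a minimal sketch; the function names are ours and purely illustrative. It assumes only that, without inbreeding, each ancestor has two parents and no ancestor is counted twice.

```python
def ancestors_without_inbreeding(generations_back: int) -> int:
    """Distinct ancestors exactly `generations_back` generations ago,
    assuming two parents per individual and no shared ancestors."""
    return 2 ** generations_back


def great_grandparent_generation(k: int) -> int:
    """Great^k-grandparents sit k + 2 generations back
    (parents are 1 back, grandparents 2 back)."""
    return k + 2


generation = great_grandparent_generation(48)
count = ancestors_without_inbreeding(generation)
print(generation)       # 50
print(count)            # 1125899906842624
print(f"{count:.2e}")   # 1.13e+15
```

So the great$^{48}$-grandparents alone would number about $1.1 \times 10^{15}$.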

For T3, we state the following open problem. 

Open Problem. If $s = (s_0, s_1, \ldots)$ is an infinite sequence from the alphabet {M, F}, say that s is biologically 
unavoidable if in every possible gendered population conforming to A1-A4 (gendered here meaning that every 
non-root has a male parent and a female parent), there is a sequence of organisms, each a parent of the 
next, whose genders match s (the sequence does not need to begin with a root). What are the biologically 
unavoidable sequences? 

A priori, it could be that every sequence is biologically unavoidable. But in (Alexander, preprint) we show a 
counterexample (see Figure 4). It could also be, a priori, that no sequence is biologically unavoidable, but in 
the same paper, we show that every eventually-periodic sequence is unavoidable. It remains an open question 
even whether there is a single unavoidable sequence that is not eventually periodic. 

Note, the result about every eventually periodic sequence being unavoidable could not be true if the sequences 
of organisms were required to start at a root, as the following example shows. According to J. Diamond (1997), 
"In an extreme scenario the first settlers [of Australia] are pictured as ... a single pregnant young woman 
carrying a male fetus." Define the set of Aboriginals to consist of that Mother, her Child, her Child's Father, 
and all the joint descendants of the Mother and Child. Thus, the roots are exactly the Mother and the Father. 
Even if we assume the Aboriginals satisfy the hypotheses of the Problem, there can be no sequence starting at a 
root and having genders M, F, M, F, . . ., for the simple reason that there is only one male root and he has no 
female Aboriginal child. 
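
The notion of a gender sequence being realized, and the effect of requiring the sequence to start at a root, can be illustrated on a finite fragment. The sketch below is only a finite analogue: the Open Problem concerns infinite sequences in infinite populations, which no finite check can settle. The function `realizes`, its `starts` parameter, and the "Daughter" vertex (an invented joint descendant of Mother and Child) are our assumptions for illustration.

```python
from typing import Dict, Iterable, List, Optional, Set


def realizes(word: str,
             gender: Dict[str, str],
             children: Dict[str, List[str]],
             starts: Optional[Iterable[str]] = None) -> bool:
    """True if some chain v0, v1, ... (each a parent of the next) has genders
    spelling `word`. If `starts` is given, v0 must belong to it."""
    if not word:
        return True
    candidates = gender.keys() if starts is None else starts
    # frontier = organisms at which a chain matching the prefix read so far can end
    frontier: Set[str] = {v for v in candidates if gender[v] == word[0]}
    for letter in word[1:]:
        frontier = {c for v in frontier for c in children.get(v, [])
                    if gender[c] == letter}
    return bool(frontier)


# Hypothetical finite fragment of the example; "Daughter" is invented.
gender = {"Father": "M", "Mother": "F", "Child": "M", "Daughter": "F"}
children = {"Father": ["Child"],
            "Mother": ["Child", "Daughter"],
            "Child": ["Daughter"]}
roots = ["Father", "Mother"]

print(realizes("MF", gender, children))                # True: Child -> Daughter
print(realizes("MF", gender, children, starts=roots))  # False: the Father has no daughter
```

Tracking only the set of possible chain endpoints keeps the check linear in the word length; it never enumerates chains explicitly.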

The theorems in this section are certainly not exhaustive. We have opted to limit ourselves to theorems 
of particular interest in systematic biology. We hope they are adequate to establish infinitary genera and 
species as at least a potential candidate for a two-way bridge between biology and the more abstract and 
theoretical side of mathematics. 

6 Infinitary Species and Internodons 

There are noteworthy similarities between infinitary species and the internodal species concepts⁴ of Kornet 
(1993), Kornet, Metz, and Schellinx (1995), and Kornet and McAllister (2005) (we shall focus on the 1995 
paper, as it is mathematically the most straightforward). 

Informally, internodons are⁵ the largest clusters subject to the constraint that permanent splits in the 
genealogical network give rise to new internodons. Thus, if (as in Figure 1) a branch in G splits permanently 
into two smaller branches, near the splitting point there are three internodons: one for the branch pre-split, 
and one for each of the two smaller branches. 

The primary similarity between internodons and inspecies is the dependence on the future (to establish 
a split's permanence may require infinitely futuristic knowledge). Both are defined purely from G (and 
birthdates), sans morphology. Both respect permanent splits (individuals on opposite sides thereof can 
share neither an internodon nor an inspecies). And both notions arose from attempts to bridge math and 
biology: Kornet et al. attempted to derive new biology from math, and this author attempted to derive new 
math from biology. 

4 This work reflects a stepwise formalization of Hennig's internodal species: unavoidability of the implied permanency of 
cleavages (Kornet, 1993), formal implementation (Kornet et al. 1995); lowering species' status, because of implied short lifespans, 
to building blocks (internodons) (Kornet et al. 1995); remedial grouping by secondary morphological criteria into composite 
species (Kornet & McAllister, 2005). The entire project was first informally printed as a PhD thesis (Reconstructing Species; 
Demarcations in Genealogical Networks, 1993, Leiden University). 

5 To quote Kornet et al. (1995), internodons are "parts of a genealogical network of individual organisms between two 
successive permanent splits or between a permanent split and an extinction event." 




Figure 4: A hypothetical population (modified from an example of T. J. Carlson) in which not every gender 
sequence is realized. Solid vertices represent males and open vertices represent females. One particular 
sequence absent in this population is $M^2 F M^5 F \cdots M^{3n-1} F \cdots$. 






Our attempt to contrast inspecies and internodons leads to another application of infinite graphs to the 
species problem, the idea of benchmark populations. 

6.1 Benchmark Populations 

As the species problem is difficult, one strategy might be to try to solve tiny sub-problems. A benchmark 
population is a hypothetical population, together with a question which would be trivial given a solution to 
the species problem. One such question might be, "does this population include members of multiple species, 
or just one?" If we solve the species problem, such a question should be straightforward. Until then, these 
questions are species sub-problems, which we might hope to answer before answering the full species problem. 

Interesting benchmark populations will most likely be infinite. As the following example will show, a 
good benchmark population is one which exhibits some "edge case" pathology specifically meant to push 
species notions to their limits. 

Our attempt to contrast inspecies and internodons led us to a particular benchmark population which we call 
the one-third variant population⁶ (see Figure 5). This population consists of infinitely many generations, each 
with one male, one female, and one variant. Each generation's male and female produce the next generation's 
male and female, and each generation's female and variant produce the next generation's variant. The question 
is: are the variants in the same species as the non-variants? 
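
To make the structure of this population concrete, here is a minimal sketch that builds its first few generations and lists directed descendants. The encoding of organisms as `(role, generation)` pairs, the function names, and the choice to treat generation 0 as roots are our assumptions; the population itself is infinite, so this is only a finite window onto it.

```python
from typing import Dict, List, Set, Tuple

Vertex = Tuple[str, int]  # (role, generation), e.g. ("variant", 3)


def one_third_variant(generations: int) -> Dict[Vertex, List[Vertex]]:
    """Map each organism in generations 0..generations-1 to its parents."""
    parents: Dict[Vertex, List[Vertex]] = {}
    for g in range(generations):
        for role in ("male", "female", "variant"):
            if g == 0:
                parents[(role, g)] = []  # assumption: generation 0 consists of roots
            elif role == "variant":
                parents[(role, g)] = [("female", g - 1), ("variant", g - 1)]
            else:
                parents[(role, g)] = [("male", g - 1), ("female", g - 1)]
    return parents


def descendants(v: Vertex, parents: Dict[Vertex, List[Vertex]]) -> Set[Vertex]:
    """All organisms reachable from v by following parent-to-child arcs."""
    children: Dict[Vertex, List[Vertex]] = {}
    for child, ps in parents.items():
        for p in ps:
            children.setdefault(p, []).append(child)
    seen: Set[Vertex] = set()
    stack = [v]
    while stack:
        for c in children.get(stack.pop(), []):
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen


population = one_third_variant(4)
print(sorted(descendants(("variant", 0), population)))
# [('variant', 1), ('variant', 2), ('variant', 3)]
```

In this window, a variant's directed descendants are all variants, whereas a female's descendants include every later organism.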

Any well-defined species notion (depending only on graph-theoretical considerations) should either answer 
the above question, explain why the question is ill-posed, or else use assumptions about reality which rule 
out the 1/3-variant population. 

Internodal species notions say that variants are in the same species as non-variants (there are no splits in 
sight). Infinitary species say that the variants are not the same species: in fact the variants are not in any 
inspecies at all (lest the directed path of variants violate the Dual Knight-Darwin Law (Theorem 9 part 1)). 

Figure 5: The 1/3-variant population. 

6.2 Infinitary Species (Type II) 

Kornet et al. (1995) suggested that internodons are species building blocks (a suggestion bearing fruit in 
Kornet & McAllister (2005)). This raises one's hopes that perhaps infinitary species are unions of internodons. 
The 1/3-variant population dashes those hopes. 

This motivates a species notion using the same principles as inspecies but admitting decomposition into 
internodons. With the 1/3-variant population in mind, what must be found is a sense in which, e.g., a variant 
is an "ancestor" of later non-variants. 

Note that in general u is an ancestor of v if and only if u is older than v and G has a directed path from 
u to v avoiding vertices older than u. So call u an undirected-ancestor (or simply an undirancestor⁷) of v if 
u is older than v and G has an undirected path from u to v avoiding vertices older than u. Figure 6 shows 
a small population (left) and the corresponding undirancestral relations (middle). The latter clause, that 
G contains such an undirected path, is written $u(\mathrm{PC}_{\geq u})v$ in Kornet et al. (1995), a notation we too shall 
adopt. This makes variants undirancestors of future non-variants in the 1/3-variant population. Say a set of 
individuals is undirancestrally closed if it contains all its own undirancestors. 
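
On a finite graph, the undirancestor relation can be tested by a search that ignores arc direction but refuses to visit any vertex older than the starting vertex. The following is a minimal sketch under that reading of the definition; the function names and the five-organism toy population are hypothetical and not taken from the figures.

```python
from typing import Dict, Hashable, List, Set


def undirected_neighbors(parents: Dict[Hashable, List[Hashable]]) -> Dict[Hashable, Set[Hashable]]:
    """Forget arc directions: link every organism to its parents and children."""
    nbrs: Dict[Hashable, Set[Hashable]] = {}
    for child, ps in parents.items():
        nbrs.setdefault(child, set())
        for p in ps:
            nbrs[child].add(p)
            nbrs.setdefault(p, set()).add(child)
    return nbrs


def is_undirancestor(u, v, parents, birth) -> bool:
    """True if u is older than v and some undirected path from u to v
    avoids every vertex older than u."""
    if not birth[u] < birth[v]:
        return False
    nbrs = undirected_neighbors(parents)
    seen, stack = {u}, [u]
    while stack:
        x = stack.pop()
        if x == v:
            return True
        for y in nbrs.get(x, ()):
            if y not in seen and birth[y] >= birth[u]:
                seen.add(y)
                stack.append(y)
    return False


# Hypothetical toy population: p is a root; a, b, d are children of p;
# c is a child of both a and b.
parents = {"p": [], "a": ["p"], "b": ["p"], "c": ["a", "b"], "d": ["p"]}
birth = {"p": 0, "a": 1, "b": 2, "c": 3, "d": 2}

print(is_undirancestor("a", "b", parents, birth))  # True: a - c - b avoids the older p
print(is_undirancestor("a", "d", parents, birth))  # False: every path passes through p
```

Here a is an undirancestor of b without being an ancestor of b: the two are linked only through their shared child c.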

6 As candidates one may think of organisms with complex haploid/diploid life cycles such as social animals (ants, bees, 
termites, weevils) or the creative algae, fungi, mosses. But simpler still would be a genuine sexual network with an additional 
variant male type with a y' chromosome. Say, this variant male's sperm cells carrying the y' chromosome always outcompete 
its own sperm cells carrying the x chromosome. Then, per generation we have one (non-variant) genuine xy male (generating 
x and y sperm cells), one (non-variant) genuine xx female (generating x egg cells only), and one variant xy' male (generating 
y' sperm cells that always outcompete its x sperm cells). In this way a variant can only co-parent (with the genuine female) 
an xy' variant son. 

7 Thus, the undirdescendants of u are precisely the members of the gross dynasty of u (minus u and its same-exact-age peers), 
in the language of Kornet et al. (1995). 
Figure 6: Left: A small population and its parental relations (birthdates indicated by vertical 
height). Middle: Its undirancestral relations. Right: Its undirparental relations. 

Say u is an undirected-parent (or simply undirparent) of v if u is an undirancestor of v and there is no 
w such that u is an undirancestor of w and w is an undirancestor of v (this is a general recipe for 
reverse-engineering parenthood notions from ancestorhood notions). Figure 6 shows a small population (left) and 
its undirparent relations (right). 
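
The general recipe for recovering a parenthood notion from an ancestorhood notion can be sketched for any finite relation given as a set of ordered pairs. The function name and the example relation below are hypothetical; the code simply drops every pair (u, v) for which some w lies strictly between u and v.

```python
from typing import Hashable, Set, Tuple

Pair = Tuple[Hashable, Hashable]


def parents_from_ancestry(ancestry: Set[Pair]) -> Set[Pair]:
    """Keep (u, v) only if no w satisfies (u, w) and (w, v) in `ancestry`."""
    vertices = {x for pair in ancestry for x in pair}
    return {
        (u, v)
        for (u, v) in ancestry
        if not any((u, w) in ancestry and (w, v) in ancestry for w in vertices)
    }


# Hypothetical ancestry relation on three organisms: p before a before c.
ancestry = {("p", "a"), ("a", "c"), ("p", "c")}
print(sorted(parents_from_ancestry(ancestry)))
# [('a', 'c'), ('p', 'a')]
```

Applied to the undirancestor relation of a finite population, the same function would return its undirparent relation.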

We'd like to apply our infinitary species machinery to undirparenthood. But first we must ensure we 
have not corrupted any of Assumptions A1-A4. 

Lemma 10. Let G' be the graph whose vertices are the organisms of G, and in which an arc is directed 
from u to v if and only if u is an undirparent of v; endow G' with the same birthdates as G. Given that G 
satisfies A1-A4, so does G' . 

Note that in the graph G' of the lemma, for any vertices $u, v \in G$, u is an undirparent (resp. undirancestor) 
of v in G iff u is a parent (resp. ancestor) of v in G'. 

Proof of Lemma 10. G' satisfies A1. First we claim (*) that any vertex (say v) with a parent (say u) must 
have an undirparent. If u is an undirparent of v, this is trivial. Otherwise, since u is clearly an undirancestor 
of v, there is some w blocking u from being an undirparent of v. If w is an undirparent of v, we're done; 
if not, there is some w' blocking w from being an undirparent of v, and so on. This process must terminate, because 
(by A3 in G) there are only finitely many individuals born before v. This proves (*). The contrapositive 
of (*) says: any individual with no undirparent must have no parent. There are only finitely many such 
individuals, since G satisfies A1. 

G' satisfies A2. Let $u \in V$; we must show $u$ has only finitely many undirchildren. Assume, for sake of 
contradiction, $u$ has infinitely many undirchildren. By A3, only finitely many vertices share $u$'s birthdate, 
and by A2 these have finitely many children, so, by A3 again, $u$ has an undirchild $v$ younger than all 
children of all vertices with $u$'s birthdate. Since $u$ is an undirancestor of $v$, there is an undirected path 
$u = u_0, \ldots, u_n = v$ avoiding vertices born before $u$. Let $j < n$ be maximal such that $t(u_j) = t(u)$. We chose 
$v$ younger than all children of $u_j$, so $n > j + 1$. Thus it makes sense to pick $j < i < n$ such that $u_i$ is as 
old as possible among all such choices for $i$. Since $u_0, \ldots, u_n$ witnesses $u(\mathrm{PC}_{\geq u})v$, it follows that $u_0, \ldots, u_i$ 
witnesses $u(\mathrm{PC}_{\geq u})u_i$. By maximality of $j$, $t(u_i) > t(u)$, so this shows $u$ is an undirancestor of $u_i$. Since $u_i$ 
was chosen as old as possible among $u_{j+1}, \ldots, u_{n-1}$, the path $u_i, \ldots, u_{n-1}$ avoids vertices born before $u_i$. And $u_n = v$ 
is younger than $u_i$, because $t(u_i) \leq t(u_{j+1})$ (by choice of $i$) and $t(u_{j+1}) < t(v)$ (by choice of $v$, since $u_{j+1}$ is 
a child of $u_j$, which has $u$'s birthdate). Thus $u_i, \ldots, u_n$ witnesses $u_i(\mathrm{PC}_{\geq u_i})v$, and since $t(u_i) < t(v)$ this 
shows $u_i$ is an undirancestor of $v$. Letting $w = u_i$, this violates the definition of $u$ being an undirparent of 
$v$, a contradiction as desired. 

G' satisfies A3. If u is an undirparent of v, then in particular u is an undirancestor of v, so by definition 
t(u) < t(v). The other part of A3 (the finiteness of $\{v \in V : t(v) < x\}$) is trivial since G' has the same 
birthdates as G. 

G' satisfies A4. G is infinite (by A4), and G' has the same vertices, so G' is infinite. □ 

One very special case is if we assume no two organisms ever share a birthdate. 

Proposition 11. Let G' be as in Lemma 10. If no two vertices share the exact same birthdate, then G' is 
a forest. 

Proof. Assume no vertices share the same birthdate. To show $G'$ is a forest, it suffices to show $G'$ contains 
no cycles. By A3, existence of a cycle in $G'$ implies some organism $u$ has $\geq 2$ distinct undirparents $v_1, v_2$. 
By hypothesis, $t(v_1) \neq t(v_2)$; we may assume $t(v_1) < t(v_2)$. It follows $v_1$ is an undirancestor of $v_2$ (we have 
$v_1(\mathrm{PC}_{\geq v_1})v_2$ via $u$); letting $w = v_2$ violates the definition of $v_1$ being an undirparent of $u$. □ 
In Figure 7, we induce unique birthdates in the network from Hennig's Figure 4 by slightly tilting it; thin 
edges are Hennig's original parental relations, thick edges are undirparental relations (which make a tree, 
with precisely one splitting point exactly where internodal speciation occurs). Note we assume the minor 
cleavages late in the network are temporary (i.e., that the network continues beyond what is shown, and minor 
branches rejoin later). 

By an infinitary species (type II) for G, I mean an infinitary species for the graph G' of Lemma 10. In other 
words, an infinitary species (type II) is an infinite set of individuals closed under undirancestry and which 
cannot possibly be shrunk while preserving these properties. With A1-A4, an inspecies (type II) exists by 
Lemma 10 and Theorem 3. All the results from Section 5 carry over to inspecies (type II), with the prefix 
"undir" attached appropriately. For example, Proposition 5 says that within an inspecies (type II), everyone 
is an undirancestor of almost everyone. In case no organisms share the exact same birthdate, infinitary 
species (type II) are simply the infinite branches in the forest G'. 

The following theorem relates internodon theory and inspecies. For the sake of clarity and length, this 
theorem is stated with extreme simplifying assumptions; in future work we will generalize it to make it 
more applicable to the real world. 




Figure 7: Undirparental relations in a 
copy of Hennig's Figure 4 population, 
tilted to induce unique birthdates. We 
assume the population continues beyond 
what is shown, in particular that minor 
cleavages (not indicated by triangles) are 
temporary. 



Theorem 12. Assume that no two individuals are born simultaneously, and that every individual has at least 
one descendant which has more than one parent. Then every inspecies (type II) is a union of internodons. 

In Kornet et al. (1995), the property (above) that an individual has at least one descendant with multiple 
parents is called the SD-property (for Sexual Descendant). 

The proof of the theorem is given in Appendix A. In future work when we strengthen this theorem, the 
strengthened version will involve the Knight-Darwin Law, tying together three central themes of this paper. 



7 Discussion 

What began with an attempt to generalize a graph of Dress et al. by considering future organisms as well as 
present and past, led us to consider some existing usages of the future biosphere in systematic biology. We 
noticed that at least one of these (the Knight-Darwin Law) and probably more (Hennig's internodal species 
concept) hinge upon infinitary assumptions, leading us to consider infinite graphs in systematic biology. 
Laboring under the (simplifying) assumption that the biosphere will be infinite, we found some natural 
cluster notions with some idealized properties of species. 

Defined entirely by ancestral relations, infinitary species share an intrinsic quality with other cladistical 
or non-morphological notions. And yet Proposition 7 shows that, despite this, we still enjoy certain recognizable 
qualities associated with morphological species notions. We consider this an application to the species 
problem because it demonstrates that these seemingly irreconcilable approaches can actually meet when 
viewed at sufficient scale. Methods of Dress et al. suggest (in light of Proposition 5) that infinitary species 
play the role of leaves at the distant future extremities of the tree of life. 

The main flaws of infinitary genera and species are as follows. First, they are completely unobservable: to 
establish even one organism's membership in one particular infinitary genus, it is necessary to have knowledge 
of the infinitely distant future. Second, not all organisms are guaranteed membership in any infinitary 
species. A childless organism is excluded from being in any infinitary species (e.g., by Proposition 5); 
this is reminiscent of the cladist's disregard for the sterile mule. Even if an organism has infinitely many 
descendants, that is still no guarantee of belonging to an infinitary species (see Section 6.1). Third, when an 
organism does have a species, it might not be unique: an individual may belong to multiple infinitary species 
at once (but the magnitude of this flaw is limited by Proposition 5). These crucial flaws re-emphasize that 
while normal species are like nodes on the tree of life, infinitary species are more like "leaves at infinity." 

We compared and contrasted our new infinitary species notion with the internodal species concept of 
Hennig, as formalized by Kornet et al. To contrast the two, we compared how they performed on a particularly 
extreme population, a technique which we referred to as using a benchmark population, and which we 
hope might serve as a more general application of infinitary graph theory to the species problem, breaking 
it into more manageable species subproblems. 

We've exhibited some new theorems (among them, a dualization of the Knight-Darwin Law) which we 
find interesting in their own right. These theorems, like those of Kornet et al. (1995) before us, are of a less 
computational nature than a lot of mathematical theorems which have come from biology. We hope they 
may contribute to the two-way bridge between biology and the theoretical, philosophical, combinatorial side 
of mathematics. 
A Proof of Theorem 12 

In this appendix we will prove Theorem 12 (hereafter we assume its hypotheses). First we review the 
definitions in Kornet et al. (1995). As we state them, they are not equivalent to the originals; they are 
altered to fit the hypotheses of Theorem 12, allowing us to simplify them.⁸ 

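Definition 5. For any $u \in G$, write $\mathrm{DYN}(u)$ for the set $\{x \in G : u(\mathrm{PC}_{\geq u})x\}$ and write ${\uparrow}(u)$ for 
$\{x \in G : t(x) \geq t(u)\}$. If $u, v \in G$ and $t(u) < t(v)$, define 

$$u\,\mathrm{INT}\,v \iff u(\mathrm{PC}_{\geq u})v \wedge \forall r\,[\{u(\mathrm{PC}_{\geq u})r \wedge (t(r) \leq t(v))\} \Rightarrow (\mathrm{DYN}(u) \cap {\uparrow}(r) = \mathrm{DYN}(r))].$$

If $t(v) < t(u)$ then $u\,\mathrm{INT}\,v \iff v\,\mathrm{INT}\,u$, and if $t(v) = t(u)$ then $u = v$ by our assumption that no individuals 
are born simultaneously, and we define $u\,\mathrm{INT}\,u$. 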

The main theorem of Kornet et al. (1995) implies INT is an equivalence relation. 

Definition 6. An internodon is an INT-equivalence class. 

We could now dive directly into the proof of Theorem 12, but we prefer to factor out a sufficient condition 
which might be useful in the future for proving that other species notions decompose into internodons (this 
condition, too, will be generalized in future work). 

Proposition 13. (Sufficient conditions for internodons-decomposition) (Assuming the hypotheses of 
Theorem 12) Let $S \subseteq V$. If $S$ is undirancestrally closed, and for each $u \in S$ and $t \in \mathbb{R}$ there is some $u' \in S$ born 
after $t$ with $u(\mathrm{PC}_{\geq u})u'$, then $S$ is a union of internodons. 

Proof. It suffices to let $u \in S$, $v \in G$, and show that if $u\,\mathrm{INT}\,v$ then $v \in S$. 

Case 1: $v$ is born before $u$. Since $u\,\mathrm{INT}\,v$, in particular $v(\mathrm{PC}_{\geq v})u$. Thus $v$ is an undirancestor of $u$, 
putting $v \in S$ by undirancestral closure. 

Case 2: $v$ is born after $u$. Since $u\,\mathrm{INT}\,v$, by definition of INT we have $u(\mathrm{PC}_{\geq u})v$ and (*) for every $r$ 
such that $u(\mathrm{PC}_{\geq u})r$ and $t(r) \leq t(v)$, we have $\mathrm{DYN}(u) \cap {\uparrow}(r) = \mathrm{DYN}(r)$. 

8 To be precise, we are assuming that every individual has the SD property, which equates the equivalence relations INT 
and INTSD, as well as the sets DYN and GDYN, of the 1995 paper. 
By the proposition's hypothesis, there is some $u' \in S$, born after $v$, such that $u(\mathrm{PC}_{\geq u})u'$. Letting $r = v$ 
in (*), we see (since $u(\mathrm{PC}_{\geq u})v$ and $t(v) \leq t(v)$) that $\mathrm{DYN}(u) \cap {\uparrow}(v) = \mathrm{DYN}(v)$. Since $u' \in \mathrm{DYN}(u) \cap {\uparrow}(v)$, 
this shows $u' \in \mathrm{DYN}(v)$, that is, $v(\mathrm{PC}_{\geq v})u'$. Thus $v$ is an undirancestor of $u'$, so $v \in S$ by undirancestral 
closure of $S$. □ 

Proof of Theorem 12. Let S be an inspecies (type II). The first hypothesis of Proposition 13 holds since S is 
undirancestrally closed by definition. The second hypothesis of Proposition 13 is immediate by Proposition 8. 

□ 

The reader will have noticed that despite imposing such onerous hypotheses on Theorem 12, we barely 
seem to have actually used those hypotheses. Their key use was in simplifying the definition of internodons. 